3574 results found.
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
Size:
600,000 market comments, 600,000 article titles
Production Status:
Newly created-finished
Use:
Text Mining
-
Paper title:Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments
-
Paper track:Short/Resources and Evaluation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chung-Chi Chen | Numeracy-600K | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
License:
Size:
100 MByte
Production Status:
Newly created-finished
Use:
Summarisation
-
Paper title:Multi-News: A Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model
-
Paper track:Long/Summarization
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Alexander Fabbri | Multi-News | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English German
Availability:
Freely Available
License:
Size:
450M sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:Learning Deep Transformer Models for Machine Translation
-
Paper track:Long/Machine Translation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Qiang Wang | WMT 2016 Translation Task Data | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
4.8 MByte
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
-
Paper title:Bias Analysis and Mitigation in the Evaluation of Authorship Verification
-
Paper track:Short/Document Analysis
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Janek Bevendorff | Webis Authorship Verification Corpus 2019 | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English German
Availability:
From Data Center(s)
License:
Size:
None
Production Status:
Existing-used
Use:
-
Paper title:Neural Architectures for Nested NER through Linearization
-
Paper track:Short/Tagging, Chunking, Syntax and Parsing
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Jana Straková | CoNLL 2003 Shared Task Named Entity data | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
varies
Production Status:
Not Applicable
Use:
Summarisation
-
Paper title:Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation
-
Paper track:Short/Summarization
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Francine Chen | Stack Exchange Data Dump | /N |
Documentation:
yes, English
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
500 MByte
Production Status:
Existing-used
Use:
Summarisation
-
Paper title:Adversarial Domain Adaptation Using Artificial Titles for Abstractive Title Generation
-
Paper track:Short/Summarization
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Francine Chen | DeepMind Q&A Dataset (Stories only) | /N |
Documentation:
yes, English
Written
Corpus,
Language Type:
Bilingual
Languages:
English Romanian
Availability:
Freely Available
License:
Size:
1 GByte
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation
-
Paper track:Long/Machine Translation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chenze Shao | wmt16 | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English German
Availability:
Freely Available
License:
Size:
2 GByte
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation
-
Paper track:Long/Machine Translation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chenze Shao | wmt14 | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
MIT
Size:
15 MByte
Production Status:
Newly created-finished
Use:
Dialogue
-
Paper title:Target-Guided Open-Domain Conversation
-
Paper track:Long/Dialogue and Interactive Systems
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Zhiting Hu | Target-Guided Open-Domain Conversation | /N |
Documentation:
English




